Nonlinear regression in environmental sciences by support vector machines combined with evolutionary strategy

نویسندگان

  • Aranildo R. Lima
  • Alex J. Cannon
  • William W. Hsieh
چکیده

A hybrid algorithm combining support vector regression with evolutionary strategy (SVR-ES) is proposed for predictive models in the environmental sciences. SVR-ES uses uncorrelated mutation with p step sizes to find the optimal SVR hyper-parameters. Three environmental forecast datasets used in the WCCI-2006 contest – surface air temperature, precipitation and sulphur dioxide concentration – were tested. We used multiple linear regression (MLR) as benchmark and a variety of machine learning techniques including bootstrap-aggregated ensemble artificial neural network (ANN), SVR-ES, SVR with hyper-parameters given by the Cherkassky-Ma estimate, the M5 regression tree, and random forest (RF). We also tested all techniques using stepwise linear regression (SLR) first to screen out irrelevant predictors. We concluded that SVR-ES is an attractive approach because it tends to outperform the other techniques and can also be implemented in an almost automatic way. The Cherkassky-Ma estimate is a useful approach for minimizing the mean absolute error and saving computational time related to the hyper-parameter search. The ANN and RF are also good options to outperform multiple linear regression (MLR). Finally, the use of SLR for predictor selection can dramatically reduce computational time and often help to enhance accuracy.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Determination of 137Ba Isotope Abundances in Water Samples by Inductively Coupled Plasma-optical Emission Spectrometry Combined with Least-squares Support Vector Machine Regression

A simple and rapid method for the determination of 137Ba isotope abundances in water samples by inductively coupled plasma-optical emission spectrometry (ICP-OES) coupled with least-squares support vector machine regression (LS-SVM) is reported. By evaluation of emission lines of barium, it was found that the emission line at 493.408 nm provides the best results for the determination...

متن کامل

Using a multi-objective genetic algorithm for SVM construction

Orazio Giustolisi Engineering Faculty of Taranto, Technical University of Bari, via Turismo no 8, Paolo VI, 74100 Taranto, Italy E-mail: [email protected]; [email protected] Support Vector Machines are kernel machines useful for classification and regression problems. In this paper, they are used for non-linear regression of environmental data. From a structural point of view, Sup...

متن کامل

A comparative study of performance of K-nearest neighbors and support vector machines for classification of groundwater

The aim of this work is to examine the feasibilities of the support vector machines (SVMs) and K-nearest neighbor (K-NN) classifier methods for the classification of an aquifer in the Khuzestan Province, Iran. For this purpose, 17 groundwater quality variables including EC, TDS, turbidity, pH, total hardness, Ca, Mg, total alkalinity, sulfate, nitrate, nitrite, fluoride, phosphate, Fe, Mn, Cu, ...

متن کامل

Mining Biological Repetitive Sequences Using Support Vector Machines and Fuzzy SVM

Structural repetitive subsequences are most important portion of biological sequences, which play crucial roles on corresponding sequence’s fold and functionality. Biggest class of the repetitive subsequences is “Transposable Elements” which has its own sub-classes upon contexts’ structures. Many researches have been performed to criticality determine the structure and function of repetitiv...

متن کامل

کاربرد الگوریتم‌های داده‌کاوی در تفکیک منابع رسوبی حوزۀ آبخیز نوده گناباد

Introduction: Reduction of sediment supply requires the implementation of soil conservation and sediment control programs in the form of watershed management plans. Sediment control programs require identifying the relative importance of sediment sources, their quantitative ascription and identification of critical areas within the watersheds. The sediment source ascription is involves two...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computers & Geosciences

دوره 50  شماره 

صفحات  -

تاریخ انتشار 2013